## Reconfigurability Issues of Future Massively Parallel SoCs



Prof. Dr.-Ing. Jürgen Teich, Hardware/Software Co-Design, University of Erlangen-Nuremberg, Am Weichselgarten 3, 91056 Erlangen, teich@cs.fau.de

MPSoC'08, June 27, 2008








## Inevitable Challenges of the Future

- **Complexity:** How to map algorithms to 1000 processors or more in space and time so that they benefit from the available massive parallelism in memory, communication, and processing?
- **Adaptivity:** To what degree should MPSoCs be equipped with support for adaptivity or reconfigurability, and at which level (HW/SW; bit, word, loop, thread, or process level)?
- **Scalability:** How to specify algorithms and generate executable programs that run efficiently without change on 1, 2, 4, or N processors?
- **Physical constraints:** low power, performance exploitation, management overhead





















## **Invasive FIR-Filter**

$$
\begin{aligned}
&P = \mathrm{INVADE}(\mathit{my\_id}, \mathrm{WEST}, N);\\
&\langle \mathbf{par}\colon\ 0 \leq p < P ::\\
&\quad \langle \mathbf{seq}\colon\ t = N/P \cdot \eta_1 + (N/P + 1) \cdot \eta_2 + 2 \cdot N/P \cdot \eta_3 \colon\ 0 \leq \eta_1 < 2 \land 0 \leq \eta_2 < N/P \land 0 \leq \eta_3 < \lceil T/2 \rceil ::\\
&\qquad a[p,t] = \begin{cases}
a[p, t - N/P] & \text{if } \eta_1 > 0\\
a[p-1, t-1] & \text{if } p > 0 \land \eta_1 = 0\\
a[p+1, t-1] & \text{if } \eta_3 > 0 \land p = 0 \land \eta_1 = 0\\
A_{\eta_2} & \text{if } \eta_3 = 0 \land p = 0 \land \eta_1 = 0
\end{cases}\\
&\qquad u[p,t] = \begin{cases}
u[p, t - N/P - 1] & \text{if } \eta_1 > 0 \land \eta_2 > 0\\
u[p-1, t-2] & \text{if } p > 0 \land \eta_1 = 0 \land \eta_2 > 0\\
u[p+1, t-2] & \text{if } \eta_3 > 0 \land p = 0 \land \eta_1 = 0 \land \eta_2 > 0\\
U_{\eta_1 + 2 \cdot p + 4 \cdot \eta_3 - \eta_2} & \text{if } \eta_3 = 0 \land p = 0 \land \eta_1 = 0 \land \eta_2 > 0
\end{cases}\\
&\qquad z[p,t] = a[p,t] \cdot b[p,t]\\
&\qquad y[p,t] = \begin{cases}
y[p, t-1] + z[p,t] & \text{if } \eta_2 > 0\\
z[p,t] & \text{if } \eta_2 = 0
\end{cases}\\
&\quad \rangle\rangle\\
&\mathrm{RETREAT}();
\end{aligned}
$$
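The invade/compute/retreat pattern of the slide can be illustrated with a small software simulation. This is only a sketch under stated assumptions: `ProcessorPool`, `invasive_fir`, and `fir_reference` are hypothetical names that do not appear in the talk, the direction argument (`WEST`) and the cycle-level schedule `t` are ignored, and the parallel `par` dimension is simulated sequentially. It only shows that the same filter code produces the same result for whatever number of processing elements (PEs) the invasion happens to grant.

```python
from typing import List, Tuple


class ProcessorPool:
    """Hypothetical stand-in for an array of processing elements (PEs)."""

    def __init__(self, total: int) -> None:
        self.free = total

    def invade(self, requested: int) -> int:
        """Claim up to `requested` free PEs; return how many were granted."""
        granted = min(requested, self.free)
        self.free -= granted
        return granted

    def retreat(self, count: int) -> None:
        """Release `count` previously claimed PEs."""
        self.free += count


def fir_reference(coeffs: List[int], samples: List[int]) -> List[int]:
    """Plain single-processor FIR filter, used only for comparison."""
    return [
        sum(c * samples[i - k] for k, c in enumerate(coeffs) if i - k >= 0)
        for i in range(len(samples))
    ]


def invasive_fir(pool: ProcessorPool, coeffs: List[int],
                 samples: List[int]) -> Tuple[List[int], int]:
    """Filter `samples` on as many PEs as the pool grants (at most N)."""
    n = len(coeffs)
    granted = pool.invade(n)              # P = INVADE(..., N)
    if granted == 0:
        raise RuntimeError("no processing element available")
    # Assumption: keep the largest PE count that divides N so that N/P
    # is an integer, and give the surplus PEs back immediately.
    p = max(d for d in range(1, granted + 1) if n % d == 0)
    pool.retreat(granted - p)
    taps_per_pe = n // p
    try:
        outputs = []
        for i in range(len(samples)):
            partial_sums = []
            for pe in range(p):                   # "par" dimension
                acc = 0
                for j in range(taps_per_pe):      # "seq" dimension
                    k = pe * taps_per_pe + j
                    if i - k >= 0:
                        acc += coeffs[k] * samples[i - k]
                partial_sums.append(acc)
            outputs.append(sum(partial_sums))
        return outputs, p
    finally:
        pool.retreat(p)                   # RETREAT()
```

Calling `invasive_fir(ProcessorPool(4), coeffs, samples)` with an 8-tap filter would use 4 PEs with 2 taps each; with a pool of 3 free PEs the same call falls back to 2 PEs with 4 taps each, without any change to the filter code.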



|     | Throughput (output samples/clock cycle) |
|-----|-----------------------------------------|
| 64  | 1.00                                    |
| 32  | 0.50                                    |
| 16  | 0.25                                    |
| 8   | 0.125                                   |
| 4   | 0.062                                   |
| 2   | 0.031                                   |
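The first column of the throughput table is not labeled in the source. One reading that fits the numbers, offered here as an assumption only, is that it gives the number of invaded processing elements P for a filter with N = 64 taps. Each PE would then process N/P taps sequentially per output sample, giving a throughput of P/N samples per clock cycle. The following sketch checks that simple model against the listed values; `N_TAPS` and `throughput` are illustrative names.

```python
N_TAPS = 64  # assumption: filter length, not stated in the source


def throughput(num_pes: int, n_taps: int = N_TAPS) -> float:
    """Output samples per clock cycle if each PE handles n_taps/num_pes taps."""
    return num_pes / n_taps


# Values as listed in the table (first column read as the PE count).
reported = {64: 1.00, 32: 0.50, 16: 0.25, 8: 0.125, 4: 0.062, 2: 0.031}

for num_pes, value in reported.items():
    print(num_pes, throughput(num_pes), value)
```

Under this reading, every listed value agrees with P/N to within the three decimal places shown.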





















## Case Study Results

- Two algorithms implemented on both architectures:
  - Edge Detection (ED) and
  - FIR Filter
- 16-bit fixed point arithmetic
- Implementation on a Xilinx Virtex-II Pro XC2VP30 FPGA
- Synthesis results:

| Array type                  | # LUTs | # FFs | # BRAMs | # MULTs | Freq. (MHz) |
|-----------------------------|--------|-------|---------|---------|-------------|
| Dedicated ED                | 1525   | 1449  | 5       | 4       | 250         |
| Dedicated FIR Filter        | 312    | 470   | -       | 4       | 277         |
| InvasIC for both algorithms | 4563   | 1493  | 8       | 4       | 150         |
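To put the synthesis figures in proportion, the sketch below compares the programmable array against the two dedicated arrays taken together. The comparison baseline is an assumption made for illustration (both dedicated arrays present on the chip at once), as is reading the dash in the BRAM column as zero; the names `dedicated`, `invasic`, and `resource_ratio` are hypothetical.

```python
# Synthesis results copied from the table; "-" is read as 0 (assumption).
dedicated = {
    "ED": {"luts": 1525, "ffs": 1449, "brams": 5, "mults": 4, "mhz": 250},
    "FIR": {"luts": 312, "ffs": 470, "brams": 0, "mults": 4, "mhz": 277},
}
invasic = {"luts": 4563, "ffs": 1493, "brams": 8, "mults": 4, "mhz": 150}


def resource_ratio(key: str) -> float:
    """InvasIC usage relative to both dedicated arrays side by side."""
    combined = dedicated["ED"][key] + dedicated["FIR"][key]
    return invasic[key] / combined


def clock_ratio(algorithm: str) -> float:
    """InvasIC clock frequency relative to one dedicated array."""
    return invasic["mhz"] / dedicated[algorithm]["mhz"]


for key in ("luts", "ffs", "brams", "mults"):
    print(key, round(resource_ratio(key), 2))
for algorithm in ("ED", "FIR"):
    print(algorithm, round(clock_ratio(algorithm), 2))
```

With these numbers the programmable array needs roughly 2.5 times the LUTs of the two dedicated designs combined, fewer flip-flops, and runs at 60% of the dedicated ED clock frequency.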

## Case Study Results

- Size of reconfiguration data:

| Array type           | Reconfiguration data size (bytes) |
|----------------------|-----------------------------------|
| Dedicated ED         | 72437                             |
| Dedicated FIR Filter | 14489                             |
| WPPA ED              | 144                               |
| WPPA FIR Filter      | 40                                |

- InvasIC reconfiguration speed (32-bit reconfiguration bus @ 133 MHz):
  - Program & Interconnect FIR: 0.3 μs
  - Program & Interconnect ED: …
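The reconfiguration data sizes can be related to each other and to the bus parameters with a few lines of arithmetic. The sketch below is an idealized estimate, not a reproduction of the measured times on the slide: it assumes that the bus moves one 32-bit word per clock cycle with no protocol or setup overhead, and the names `size_reduction` and `transfer_time_us` are illustrative.

```python
import math

BUS_WIDTH_BYTES = 4      # 32-bit reconfiguration bus
BUS_CLOCK_HZ = 133e6     # 133 MHz

# Reconfiguration data sizes in bytes, copied from the table.
sizes = {
    "Dedicated ED": 72437,
    "Dedicated FIR Filter": 14489,
    "WPPA ED": 144,
    "WPPA FIR Filter": 40,
}


def size_reduction(dedicated: str, programmable: str) -> float:
    """How many times smaller the programmable configuration is."""
    return sizes[dedicated] / sizes[programmable]


def transfer_time_us(n_bytes: int) -> float:
    """Raw bus transfer time in microseconds, one word per cycle (assumption)."""
    words = math.ceil(n_bytes / BUS_WIDTH_BYTES)
    return words / BUS_CLOCK_HZ * 1e6


print(round(size_reduction("Dedicated ED", "WPPA ED")))
print(round(size_reduction("Dedicated FIR Filter", "WPPA FIR Filter")))
print(transfer_time_us(sizes["WPPA ED"]))
print(transfer_time_us(sizes["WPPA FIR Filter"]))
```

The configuration data shrinks by a factor of several hundred for both algorithms, and under the one-word-per-cycle assumption the raw transfer of either programmable configuration stays well below one microsecond.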























